Overview

Dataset Statistics

Number of Variables 25
Number of Rows 303
Missing Cells 199
Missing Cells (%) 2.6%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 305.9 KB
Average Row Size in Memory 1.0 KB
Variable Types
  • Categorical: 14
  • Numerical: 6
  • Geography: 1
  • DateTime: 4
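
The percentages in this summary follow directly from the counts. As a quick consistency check, here is a minimal sketch that hard-codes the figures printed above rather than reading the dataset (the variable names are illustrative):

```python
# Figures copied from the Dataset Statistics block above.
n_variables = 25
n_rows = 303
missing_cells = 199
total_size_kb = 305.9

# Missing Cells (%) is the share of all cells (rows x variables) that are empty.
total_cells = n_rows * n_variables
missing_pct = 100 * missing_cells / total_cells

# Average row size is the total in-memory size divided by the row count.
avg_row_kb = total_size_kb / n_rows

# The per-type counts should add up to the number of variables.
type_counts = {"Categorical": 14, "Numerical": 6, "Geography": 1, "DateTime": 4}
types_total = sum(type_counts.values())

print(f"Missing cells: {missing_pct:.1f}% of {total_cells} cells")
print(f"Average row size: {avg_row_kb:.1f} KB")
print(f"Variables by type: {types_total}")
```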

Dataset Insights

  • EmpID is uniformly distributed (Uniform)
  • Salary is skewed (Skewed)
  • Zip is skewed (Skewed)
  • EngagementSurvey is skewed (Skewed)
  • Employee_Name has a high cardinality: 303 distinct values (High Cardinality)
  • State has constant length 2 (Constant Length)
  • EmpSatisfaction has constant length 1 (Constant Length)
  • Employee_Name has all distinct values (Unique)

Variables


Employee_Name

categorical

Approximate Distinct Count 303
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory Size 24160

Length

Mean 14.736
Standard Deviation 2.7045
Median 15
Minimum 8
Maximum 25

Sample

1st row Adinolfi, Wilson ...
2nd row Ait Sidi, Karthike...
3rd row Akinkuolie, Sarah
4th row Alagbe,Trina
5th row Anderson, Carol

Letter

Count 3722
Lowercase Letter 3090
Space Separator 434
Uppercase Letter 632
Dash Punctuation 3
Decimal Number 0

EmpID

numerical

Approximate Distinct Count 303
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 10156.4092
Minimum 10001
Maximum 10311
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • EmpID is uniformly distributed
  • EmpID is nearly symmetric: its skewness is negligible (γ1 = -0.0038)

Quantile Statistics

Minimum 10001
5-th Percentile 10017.1
Q1 10079.5
Median 10157
Q3 10234.5
95-th Percentile 10295.9
Maximum 10311
Range 310
IQR 155

Descriptive Statistics

Mean 10156.4092
Standard Deviation 90.1236
Variance 8122.2691
Sum 3.0774e+06
Skewness -0.00379
Kurtosis -1.207
Coefficient of Variation 0.008874
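
The "uniformly distributed" insight for EmpID can be sanity-checked against the figures above. For a continuous uniform distribution the standard deviation equals range/√12 and the excess kurtosis is -1.2. The sketch below compares those textbook values with the reported ones and recomputes the coefficient of variation. It assumes the reported Kurtosis is excess kurtosis, which the report itself does not state.

```python
import math

# Figures copied from the EmpID statistics above.
minimum, maximum = 10001, 10311
mean_reported = 10156.4092
std_reported = 90.1236
kurtosis_reported = -1.207

# Textbook values for a continuous uniform distribution over the same range.
value_range = maximum - minimum
uniform_std = value_range / math.sqrt(12)
uniform_excess_kurtosis = -1.2

# Coefficient of variation = standard deviation / mean.
cv = std_reported / mean_reported

print(f"Range: {value_range}")
print(f"Std if uniform: {uniform_std:.2f} (reported {std_reported})")
print(f"Excess kurtosis if uniform: {uniform_excess_kurtosis} (reported {kurtosis_reported})")
print(f"Coefficient of variation: {cv:.6f}")
```

Both reported values sit close to the uniform reference, which is consistent with sequentially assigned IDs.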

Salary

numerical

Approximate Distinct Count 300
Approximate Unique (%) 99.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 69292.3168
Minimum 45046
Maximum 250000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Salary is skewed right (γ1 = 3.2513)

Quantile Statistics

Minimum 45046
5-th Percentile 47022
Q1 55633
Median 62910
Q3 72331
95-th Percentile 108810.9
Maximum 250000
Range 204954
IQR 16698

Descriptive Statistics

Mean 69292.3168
Standard Deviation 25406.0928
Variance 6.4547e+08
Sum 2.0996e+07
Skewness 3.2513
Kurtosis 14.8031
Coefficient of Variation 0.3667
  • Salary is not normally distributed (p-value ≈ 1.98e-10)
  • Salary has 29 outliers
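
The report does not say which rule it uses to count outliers. A common choice is Tukey's 1.5 × IQR fences, and the sketch below applies that rule to the Salary quartiles above. It is an assumption about the method, not a reproduction of the report's count of 29.

```python
# Quartiles and extremes copied from the Salary statistics above.
q1, q3 = 55633, 72331
minimum, maximum = 45046, 250000
p95 = 108810.9

# Tukey's rule (assumed here): values beyond 1.5 * IQR from the quartiles are outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
print(f"Low outliers possible: {minimum < lower_fence}")
print(f"95th percentile above upper fence: {p95 > upper_fence}")
```

Under this rule every Salary outlier would be a high value, since the minimum lies inside the lower fence, and more than 5% of rows would exceed the upper fence.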

Position

categorical

Approximate Distinct Count 32
Approximate Unique (%) 10.6%
Missing 0
Missing (%) 0.0%
Memory Size 25951
  • The most frequent value (Production Technician I) occurs over 2.51 times as often as the second most frequent value (Production Technician II)

Length

Mean 20.6469
Standard Deviation 4.1865
Median 23
Minimum 3
Maximum 28

Sample

1st row Production Technic...
2nd row Sr. DBA
3rd row Production Technic...
4th row Production Technic...
5th row Production Technic...

Letter

Count 5704
Lowercase Letter 4787
Space Separator 538
Uppercase Letter 917
Dash Punctuation 4
Decimal Number 0
  • The top 2 categories (Production Technician I, Production Technician II) account for over 50.0% of rows

State

categorical

Approximate Distinct Count 28
Approximate Unique (%) 9.2%
Missing 0
Missing (%) 0.0%
Memory Size 20301
  • The most frequent value (MA) occurs over 44.67 times as often as the second most frequent value (CT)

Length

Mean 2
Standard Deviation 0
Median 2
Minimum 2
Maximum 2

Sample

1st row MA
2nd row MA
3rd row MA
4th row MA
5th row MA

Letter

Count 606
Lowercase Letter 0
Space Separator 0
Uppercase Letter 606
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (MA, CT) account for over 50.0% of rows
  • The most frequent word (ma) occurs over 44.67 times as often as the second most frequent word (ct)
  • State has words of constant length

Zip

numerical

Approximate Distinct Count 155
Approximate Unique (%) 51.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 6673.5446
Minimum 1040
Maximum 98052
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Zip is skewed right (γ1 = 4.0239)

Quantile Statistics

Minimum 1040
5-th Percentile 1730
Q1 1901.5
Median 2130
Q3 2359.5
95-th Percentile 39910.9
Maximum 98052
Range 97012
IQR 458

Descriptive Statistics

Mean 6673.5446
Standard Deviation 17114.8607
Variance 2.9292e+08
Sum 2.0221e+06
Skewness 4.0239
Kurtosis 15.3802
Coefficient of Variation 2.5646
  • Zip is not normally distributed (p-value ≈ 4.50e-25)
  • Zip has 35 outliers
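
Zip is profiled as a numerical variable, so the mean, skewness and outlier count above describe an identifier rather than a quantity. Numeric storage also drops leading zeros: the minimum 1040 is presumably the ZIP code 01040. A minimal sketch of restoring the five-character form follows; `to_zip_string` is a hypothetical helper, not part of the report.

```python
def to_zip_string(zip_number: int) -> str:
    """Format a ZIP code stored as an integer as a 5-character string."""
    if not 0 <= zip_number <= 99999:
        raise ValueError(f"not a 5-digit ZIP code: {zip_number}")
    # Pad with leading zeros that were lost when the column was parsed as a number.
    return f"{zip_number:05d}"


# Minimum and maximum copied from the Zip statistics above.
print(to_zip_string(1040))
print(to_zip_string(98052))
```

Treating the column as a string (or categorical) variable before profiling would avoid the misleading numeric statistics.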

DOB

datetime

Approximate Distinct Count 299.6842
Approximate Unique (%) 98.9%
Missing 0
Missing (%) 0.0%
Memory Size 2552
Minimum 1951-01-02 00:00:00
Maximum 1992-08-17 00:00:00

Sex

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.7%
Missing 0
Missing (%) 0.0%
Memory Size 21381

Length

Mean 5.5644
Standard Deviation 0.4967
Median 6
Minimum 5
Maximum 6

Sample

1st row Male
2nd row Male
3rd row Female
4th row Female
5th row Female

Letter

Count 1554
Lowercase Letter 1251
Space Separator 132
Uppercase Letter 303
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Female, Male ) account for over 50.0% of rows
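
The Length and Letter figures for Sex suggest that one category carries a trailing space: the minimum length is 5 rather than 4, and 132 space separators are counted. The sketch below rebuilds the column under that assumption (132 values of "Male " and 171 of "Female", inferred from the counts above rather than read from the data) and shows the effect of stripping whitespace.

```python
# Reconstructed column: counts are inferred from the Letter statistics above.
values = ["Male "] * 132 + ["Female"] * 171

# These should match the reported Length mean and Space Separator count.
mean_length = sum(len(v) for v in values) / len(values)
space_count = sum(v.count(" ") for v in values)
lowercase_count = sum(ch.islower() for v in values for ch in v)

# Stripping surrounding whitespace normalizes the labels.
cleaned = [v.strip() for v in values]

print(f"Mean length: {mean_length:.4f}")
print(f"Spaces: {space_count}, lowercase letters: {lowercase_count}")
print(f"Categories after cleaning: {sorted(set(cleaned))}")
```

The same check applies to any categorical variable whose Space Separator count is higher than its visible labels would explain.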

MaritalDesc

categorical

Approximate Distinct Count 5
Approximate Unique (%) 1.7%
Missing 0
Missing (%) 0.0%
Memory Size 21736

Length

Mean 6.736
Standard Deviation 0.7824
Median 7
Minimum 6
Maximum 9

Sample

1st row Single
2nd row Married
3rd row Married
4th row Married
5th row Divorced

Letter

Count 2041
Lowercase Letter 1738
Space Separator 0
Uppercase Letter 303
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Single, Married) account for over 50.0% of rows

CitizenDesc

categorical

Approximate Distinct Count 3
Approximate Unique (%) 1.0%
Missing 0
Missing (%) 0.0%
Memory Size 22837
  • The most frequent value (US Citizen) occurs over 23.92 times as often as the second most frequent value (Eligible NonCitizen)

Length

Mean 10.3696
Standard Deviation 1.7592
Median 10
Minimum 10
Maximum 19

Sample

1st row US Citizen
2nd row US Citizen
3rd row US Citizen
4th row US Citizen
5th row US Citizen

Letter

Count 2839
Lowercase Letter 1934
Space Separator 299
Uppercase Letter 905
Dash Punctuation 4
Decimal Number 0
  • The top 2 categories (US Citizen, Eligible NonCitizen) account for over 50.0% of rows

HispanicLatino

categorical

Approximate Distinct Count 4
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Memory Size 20329
  • The most frequent value (No) occurs over 10.15 times as often as the second most frequent value (Yes)

Length

Mean 2.0924
Standard Deviation 0.2901
Median 2
Minimum 2
Maximum 3

Sample

1st row No
2nd row No
3rd row No
4th row No
5th row No

Letter

Count 634
Lowercase Letter 333
Space Separator 0
Uppercase Letter 301
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (No, Yes) account for over 50.0% of rows
  • The most frequent word (no) occurs over 9.79 times as often as the second most frequent word (yes)

RaceDesc

categorical

Approximate Distinct Count 6
Approximate Unique (%) 2.0%
Missing 0
Missing (%) 0.0%
Memory Size 23006
  • The most frequent value (White) occurs over 2.32 times as often as the second most frequent value (Black or African American)

Length

Mean 10.9274
Standard Deviation 9.05
Median 5
Minimum 5
Maximum 32

Sample

1st row White
2nd row White
3rd row White
4th row White
5th row White

Letter

Count 3029
Lowercase Letter 2559
Space Separator 282
Uppercase Letter 470
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (White, Black or African American) account for over 50.0% of rows
  • The most frequent word (white) occurs over 2.23 times as often as the second most frequent word (american)

DateofHire

datetime

Approximate Distinct Count 99.0749
Approximate Unique (%) 32.7%
Missing 0
Missing (%) 0.0%
Memory Size 2552
Minimum 2006-01-09 00:00:00
Maximum 2018-07-09 00:00:00

DateofTermination

datetime

Approximate Distinct Count 97.0719
Approximate Unique (%) 93.3%
Missing 199
Missing (%) 65.7%
Memory Size 2552
Minimum 2010-08-30 00:00:00
Maximum 2018-11-10 00:00:00
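
Missing (%) here is relative to all 303 rows, while Approximate Unique (%) appears to be relative to the non-missing values only. A sketch of the arithmetic, using the figures printed above (the choice of denominator is inferred from the numbers, not documented in the report):

```python
# Figures copied from the report: row count from Dataset Statistics,
# the rest from the DateofTermination block above.
n_rows = 303
missing = 199
approx_distinct = 97.0719

# Share of rows with no termination date.
missing_pct = 100 * missing / n_rows

# Uniqueness computed over the rows that do have a value.
non_missing = n_rows - missing
unique_pct = 100 * approx_distinct / non_missing

print(f"Missing: {missing_pct:.1f}% of {n_rows} rows")
print(f"Non-missing rows: {non_missing}")
print(f"Approximate unique: {unique_pct:.1f}% of non-missing rows")
```

All 199 missing cells in the dataset come from this one variable, which matches the Missing Cells total in the overview.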

TermReason

categorical

Approximate Distinct Count 18
Approximate Unique (%) 5.9%
Missing 0
Missing (%) 0.0%
Memory Size 24394
  • The most frequent value (N/A-StillEmployed) occurs over 9.95 times as often as the second most frequent value (Another position)

Length

Mean 15.5083
Standard Deviation 3.944
Median 17
Minimum 5
Maximum 32

Sample

1st row N/A-StillEmployed
2nd row career change
3rd row hours
4th row N/A-StillEmployed
5th row return to school

Letter

Count 4192
Lowercase Letter 3374
Space Separator 94
Uppercase Letter 818
Dash Punctuation 210
Decimal Number 0
  • The top 2 categories (N/A-StillEmployed, Another position) account for over 50.0% of rows
  • The most frequent word (nastillemployed) occurs over 9.95 times as often as the second most frequent word (position)

EmploymentStatus

categorical

Approximate Distinct Count 3
Approximate Unique (%) 1.0%
Missing 0
Missing (%) 0.0%
Memory Size 23145
  • The most frequent value (Active) occurs over 2.26 times as often as the second most frequent value (Voluntarily Terminated)

Length

Mean 11.3861
Standard Deviation 7.4749
Median 6
Minimum 6
Maximum 22

Sample

1st row Active
2nd row Voluntarily Termin...
3rd row Voluntarily Termin...
4th row Active
5th row Voluntarily Termin...

Letter

Count 3330
Lowercase Letter 2923
Space Separator 120
Uppercase Letter 407
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Active, Voluntarily Terminated) account for over 50.0% of rows
  • The most frequent word (active) occurs over 1.91 times as often as the second most frequent word (terminated)

Department

categorical

Approximate Distinct Count 6
Approximate Unique (%) 2.0%
Missing 0
Missing (%) 0.0%
Memory Size 23870
  • The most frequent value (Production ) occurs over 4.02 times as often as the second most frequent value (IT/IS)

Length

Mean 13.7789
Standard Deviation 5.3871
Median 17
Minimum 5
Maximum 20

Sample

1st row Production
2nd row IT/IS
3rd row Production
4th row Production
5th row Production

Letter

Count 2697
Lowercase Letter 2223
Space Separator 1428
Uppercase Letter 474
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Production , IT/IS) account for over 50.0% of rows
  • The most frequent word (production) occurs over 4.02 times as often as the second most frequent word (itis)

ManagerName

categorical

Approximate Distinct Count 21
Approximate Unique (%) 6.9%
Missing 0
Missing (%) 0.0%
Memory Size 23522

Length

Mean 12.6304
Standard Deviation 2.1658
Median 13
Minimum 8
Maximum 18

Sample

1st row Michael Albert
2nd row Simon Roup
3rd row Kissy Sullivan
4th row Elijiah Gray
5th row Webster Butler

Letter

Count 3508
Lowercase Letter 2888
Space Separator 312
Uppercase Letter 620
Dash Punctuation 0
Decimal Number 0

ManagerID

numerical

Approximate Distinct Count 23
Approximate Unique (%) 7.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 14.571
Minimum 1
Maximum 39
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • ManagerID is skewed right (γ1 = 0.7555)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 10
Median 15
Q3 19
95-th Percentile 22
Maximum 39
Range 38
IQR 9

Descriptive Statistics

Mean 14.571
Standard Deviation 8.0783
Variance 65.259
Sum 4415
Skewness 0.7555
Kurtosis 1.5623
Coefficient of Variation 0.5544
  • ManagerID is not normally distributed (p-value ≈ 0.00905)
  • ManagerID has 13 outliers
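
ManagerID is an identifier, so its mean, skewness and outliers carry little meaning. More useful is that the report counts 23 distinct ManagerID values but only 21 distinct ManagerName values (both approximate), which hints that some names may map to more than one ID. A minimal sketch of how one might check this follows; the rows are made up for illustration and `names_with_multiple_ids` is a hypothetical helper.

```python
from collections import defaultdict


def names_with_multiple_ids(rows):
    """Return {name: sorted ids} for every name paired with more than one ID."""
    ids_by_name = defaultdict(set)
    for name, manager_id in rows:
        ids_by_name[name].add(manager_id)
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


# Made-up (ManagerName, ManagerID) pairs, not taken from the dataset.
rows = [
    ("Manager A", 1),
    ("Manager A", 1),
    ("Manager B", 2),
    ("Manager B", 7),
]

conflicts = names_with_multiple_ids(rows)
print(conflicts)  # {'Manager B': [2, 7]}
```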

RecruitmentSource

categorical

Approximate Distinct Count 9
Approximate Unique (%) 3.0%
Missing 0
Missing (%) 0.0%
Memory Size 22857

Length

Mean 10.4356
Standard Deviation 4.4344
Median 8
Minimum 5
Maximum 23

Sample

1st row LinkedIn
2nd row Indeed
3rd row LinkedIn
4th row Indeed
5th row Google Search

Letter

Count 3024
Lowercase Letter 2490
Space Separator 137
Uppercase Letter 534
Dash Punctuation 1
Decimal Number 0
  • The top 2 categories (Indeed, LinkedIn) account for over 50.0% of rows

PerformanceScore

categorical

Approximate Distinct Count 4
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Memory Size 22888
  • The most frequent value (Fully Meets) occurs over 6.56 times as often as the second most frequent value (Exceeds)

Length

Mean 10.538
Standard Deviation 2.5678
Median 11
Minimum 3
Maximum 17

Sample

1st row Exceeds
2nd row Fully Meets
3rd row Fully Meets
4th row Fully Meets
5th row Fully Meets

Letter

Count 2939
Lowercase Letter 2356
Space Separator 254
Uppercase Letter 583
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Fully Meets, Exceeds) account for over 50.0% of rows

EngagementSurvey

numerical

Approximate Distinct Count 118
Approximate Unique (%) 38.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 4.1061
Minimum 1.12
Maximum 5
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • EngagementSurvey is skewed left (γ1 = -1.111)

Quantile Statistics

Minimum 1.12
5-th Percentile 2.4
Q1 3.675
Median 4.28
Q3 4.7
95-th Percentile 5
Maximum 5
Range 3.88
IQR 1.025

Descriptive Statistics

Mean 4.1061
Standard Deviation 0.7946
Variance 0.6314
Sum 1244.16
Skewness -1.111
Kurtosis 1.0995
Coefficient of Variation 0.1935
  • EngagementSurvey is not normally distributed (p-value ≈ 6.08e-15)
  • EngagementSurvey has 9 outliers

EmpSatisfaction

categorical

Approximate Distinct Count 5
Approximate Unique (%) 1.7%
Missing 0
Missing (%) 0.0%
Memory Size 19998

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 5
2nd row 3
3rd row 3
4th row 5
5th row 4

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 303
  • The top 2 categories (3, 5) account for over 50.0% of rows
  • EmpSatisfaction has words of constant length

LastPerformanceReview_Date

datetime

Approximate Distinct Count 135.1392
Approximate Unique (%) 44.6%
Missing 0
Missing (%) 0.0%
Memory Size 2552
Minimum 2010-07-14 00:00:00
Maximum 2019-02-28 00:00:00

Absences

numerical

Approximate Distinct Count 20
Approximate Unique (%) 6.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 4848
Mean 10.2673
Minimum 1
Maximum 20
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Absences is nearly symmetric: its skewness is negligible (γ1 = 0.0198)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 5
Median 10
Q3 15
95-th Percentile 19
Maximum 20
Range 19
IQR 10

Descriptive Statistics

Mean 10.2673
Standard Deviation 5.8839
Variance 34.6204
Sum 3111
Skewness 0.01984
Kurtosis -1.3059
Coefficient of Variation 0.5731
  • Absences is not normally distributed (p-value ≈ 0.000748)
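
The skewness reported for Absences (γ1 = 0.0198) is close to zero, so the sign alone says little about the shape of the distribution. For reference, here is a sketch of the moment-based skewness coefficient, γ1 = m3 / m2^1.5, where m2 and m3 are the second and third central moments. Whether the report uses this population form or a bias-corrected variant is an assumption; the sample values are made up.

```python
def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Made-up samples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 10]

print(f"Symmetric sample: {skewness(symmetric):.4f}")
print(f"Right-tailed sample: {skewness(right_tailed):.4f}")
```

A symmetric sample gives a value of zero, while a clear right tail gives a value well above it, which puts the Salary (3.2513) and Zip (4.0239) figures in context.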

Interactions

Correlations

Missing Values